PROGRAMMABLE DELAY CIRCUIT WITHIN A CONTENT
ADDRESSABLE MEMORY 

FIELD OF THE INVENTION 

[0001] This invention relates to the field of memory devices and, in 
particular, to content addressable memory devices. 

BACKGROUND OF THE INVENTION

[0002] One issue facing memory device designers today is the problem 
associated with a clock signal being distributed throughout the integrated 
memory circuit die while maintaining minimal clock skew. Clock signals are 
used to control the times at which component operations occur in a digital 
circuit. Clock skew is the time difference between clock signal edges arriving at 
different areas (e.g., different components) of an integrated circuit. Minimizing 
clock skew is important because digital logic circuits within memory devices 
require precise clocking for proper operation of the entire device (e.g., the 
outputting of data). Ideally, synchronous memory devices should have clock 
signals arriving simultaneously to all circuit components within the device that 
operate on the same clock period and same clock edge. In practice, the delay 
through a clock signal path should not be more than the interval between one of 
the edges of the clock signal and a following edge of the clock signal. As such, if 
there is a substantial amount of clock skew within a device, some components 
may not receive an edge of a clock signal before other components receive a 
subsequent edge based on the frequency of the clock signal. This prevents some 
components from operating at designated times relative to other components,
and can cause the device to malfunction. As such, clock skew may limit the 
maximum clock frequency that a device may operate at because the device needs 
to be designed to accommodate worst case clock skew and still operate properly. 
[0003] One of the causes of clock skew within an integrated circuit is that 
the impedance, or resistance-capacitance (RC), of the traces that route the clock 
signal to different areas of the device generates a delay in the clock signal. Other 
contributors to clock skew are delays due to passing the clock signal through 
pads and input buffers and the loading of the various registers that are driven by 
the clock signal, as illustrated in Figure 1A. The total clock signal input to data 
output (Dout) delay (Tcd) for the exemplary path illustrated in Figure 1A is equal
to the input buffer delay + the clock (CLK) buffer delay + RC delay of the metal 
trace + output register delay + the output buffer delay. Some exemplary values 
of the various delays may be 0.5 nanoseconds (ns) for the input buffer delay, 0.5 
ns for the clock buffer delay, 1.5 ns for the trace delay, 0.5 ns for the output 
register delay, and 1.5 ns for the output buffer delay resulting in a Tcd of 4.5 ns. 
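
By way of illustration only, the Tcd arithmetic above may be checked with a short Python sketch. The dictionary keys and the function name are hypothetical labels chosen for this example; the delay values are the exemplary values recited above.

```python
# Exemplary delay contributions from the text, in nanoseconds.
# The key names are illustrative labels, not reference designators.
EXEMPLARY_DELAYS_NS = {
    "input_buffer": 0.5,
    "clock_buffer": 0.5,
    "trace_rc": 1.5,
    "output_register": 0.5,
    "output_buffer": 1.5,
}


def clock_to_data_out_delay(delays_ns):
    """Return Tcd as the sum of the delays along the clock-to-Dout path."""
    return sum(delays_ns.values())


tcd_ns = clock_to_data_out_delay(EXEMPLARY_DELAYS_NS)
print(f"Tcd = {tcd_ns} ns")  # prints "Tcd = 4.5 ns"
```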
[0004] The Dout of the memory integrated circuit may be provided to other 
components (e.g., component B) that are connected with the memory integrated 
circuit on one or more printed circuit boards (PCB), as illustrated in Figure 1B.
The memory integrated circuit and component B may both be timed to operate 
based on the same clock signal. However, components typically require a setup 
time (Tsu) in order to operate properly. The setup time is the minimum time 
needed for Dout to be applied at the input of component B before component B is
triggered by the clock signal to perform a designated operation. An exemplary 
setup time may be approximately 2 ns. As such, component B would need to 
receive Dout from the memory circuit at least 2 ns before a subsequent clock edge 
of the clock signal that triggers component B to perform its operation. 
Continuing the example above, if the memory integrated circuit has a Tcd of 4.5 
ns (e.g., in a read operation of the memory array), then Dout would be provided
to component B after approximately 4.5 ns. If the clock signal has a frequency of, 
for example, 100 MHz (i.e., the time period between clock signal edges is 10 ns), 
then there would be sufficient time (10 ns - 4.5 ns = 5.5 ns) to allow for the setup (2 
ns) of component B with the clock skew of 4.5 ns. However, if the frequency of 
the clock signal is increased to 200 MHz, as illustrated in Figure 1C, then the time 
period between triggering clock edges is 5 ns. With a Tcd of 4.5 ns, Dout 
would not be provided to component B in enough time (short by a delta of 1.5 ns) 
to allow for the 2 ns setup time for component B before a subsequent edge of the 
200 MHz clock signal triggers component B to operate using Dout. Thus, a
subsequent clock edge would be used to clock component B. This will decrease 
the overall throughput of the system incorporating components A and B. With 
devices operating at increasing frequencies, clock skew poses an increasing 
problem. Moreover, as the level of integration in a memory device increases,
clock skew due to the above noted contributing factors becomes even greater. 
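
The setup-time example above may be restated as a slack calculation: the clock period, less Tcd, less Tsu. The following Python sketch is illustrative only; the function name and parameter names are assumptions made for this example, and the numeric values are the exemplary ones given above.

```python
def setup_margin_ns(clock_mhz, tcd_ns, tsu_ns):
    """Return the slack left after Tcd and Tsu within one clock period.

    A negative result means Dout arrives too late to satisfy the setup
    time of the receiving component before the next triggering edge.
    """
    period_ns = 1000.0 / clock_mhz  # e.g., 100 MHz -> 10 ns
    return period_ns - tcd_ns - tsu_ns


for freq_mhz in (100, 200):
    margin = setup_margin_ns(freq_mhz, tcd_ns=4.5, tsu_ns=2.0)
    status = "met" if margin >= 0 else "violated"
    print(f"{freq_mhz} MHz: margin {margin:+.1f} ns, setup {status}")
```

At 100 MHz the sketch reports a positive margin of 3.5 ns (5.5 ns available less the 2 ns setup time); at 200 MHz it reports the 1.5 ns shortfall noted above.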
[0005] Figure 2A illustrates prior art solutions to clock skew in 
synchronous random access memory (SRAM) devices. One solution is to utilize 
a phase locked loop (PLL) to generate an internal clock signal that is 
synchronized with the original reference clock signal and then use the internal 
clock signal to drive output registers of the SRAM. A typical PLL contains a 
voltage controlled oscillator (VCO) to generate the internal clock signal having a
fixed amount of delay with respect to the reference clock signal. The PLL also 
contains a phase detector to measure the phase difference between the reference 
clock signal and the internal clock signal. The measured difference drives a 
charge pump to raise and lower the voltage level of a loop filter. The loop filter 
provides a stable voltage input to the VCO. Because the frequency of the reference
clock signal may vary over time, these differences are provided back to the phase 
detector and used to lock the frequency of the internal clock signal to the 
reference clock signal. If, for example, the frequency of the reference clock signal 
shifts slightly, the phase difference between the VCO signal and reference clock 
signal will begin to increase with time. This changes the control voltage on the 
VCO in such a way as to bring the VCO frequency of the internal clock signal 
back to the value of the reference clock signal. Thus, the loop maintains lock 
when the reference clock signal frequency varies. 
[0006] The resulting internal clock signal is phase shifted from the 
reference clock signal such that both clock signals have the same frequency but 
the triggering edges of the internal clock signal are delayed with respect to the 
triggering edges of the reference clock signal. The internal clock signal is
generated during power-up of the SRAM before any circuit operations are 
performed. As such, although the internal clock signal is delayed from the 
reference clock signal, the output registers may trigger off of a later clock edge of 
the free running internal clock signal that exists earlier in time than the skewed 
reference clock signal edge, as illustrated in Figure 2B. In this manner, the clock 
signal input to data output delay associated with the path illustrated in Figure 
1A is reduced.
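
The feedback behavior of the PLL described above may be pictured with a highly simplified discrete-time model. The following Python sketch is a toy under stated assumptions, not a circuit model: the function name, the single lumped loop_gain parameter (standing in for the charge pump, loop filter, and VCO gain together), and the step count are all hypothetical.

```python
def simulate_lock(ref_freq_mhz, vco_center_mhz, loop_gain=0.5, steps=60):
    """Toy loop: accumulated phase error steers the VCO frequency.

    Units are arbitrary. loop_gain lumps together the charge pump, loop
    filter, and VCO gain, which this sketch does not model separately.
    """
    phase_error = 0.0
    vco_freq = vco_center_mhz
    for _ in range(steps):
        # Phase detector: the error keeps growing while the frequencies differ.
        phase_error += ref_freq_mhz - vco_freq
        # Control voltage pulls the VCO toward the reference frequency.
        vco_freq = vco_center_mhz + loop_gain * phase_error
    return vco_freq


# The loop settles at the reference frequency, and follows it when it shifts.
print(simulate_lock(200.0, 198.0))
print(simulate_lock(201.0, 198.0))
```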

[0007] For high speed designs running at clock frequencies greater than, 
for example, 200 megahertz (MHz) (cycle time of 5 ns), the Tcd parameter may need 
to be very small (e.g., on the order of 1-2 ns). One solution for minimizing this 
parameter is to use a delay locked loop (DLL) to synchronize an internally
generated clock signal with the reference clock signal and use the internal clock 
signal to drive output registers of the SRAM. A typical DLL includes a phase 
detector that measures the phase difference between the reference clock signal 
and the internally generated clock signal. The phase detector drives a shift 
register that causes stored data to shift positions based on the difference in 
signals. The shift register is coupled to a delay line to produce a phase-adjusted 
clock signal by sequentially delaying the internal clock signal according to the
shift register data. The internal clock signal is fed back to the phase detector for 
comparison with the reference clock signal. As with the PLL, when the reference 
clock signal and the internal clock signal are the same, the DLL is locked onto the 
reference clock signal. As such, a feedback relationship is used to generate and 
maintain the internal clock signal with both the PLL and the DLL. 
[0008] One problem with using a PLL is that the phase detector, loop
filter, and VCO are typically analog components that have poor stability and 
performance in noisy digital switching environments. Similar problems may 
exist with the components used in a DLL. As such, it may not be desirable to use 
a PLL or DLL in content addressable memory (CAM) devices that are typically 
noisier than SRAM devices due to simultaneously comparing data with many 
CAM cells in the CAM array. In addition, the analog components used in a 
PLL/DLL utilize separate power and ground supplies that typically use higher 
voltages (e.g., 2.5 V to 3.5 V) than digital components (e.g., 1.2 V). Moreover, PLL
and DLL components may only be able to operate in a fixed frequency range, 
thereby limiting their versatility. 

BRIEF DESCRIPTION OF THE DRAWINGS

[0009] The present invention is illustrated by way of example and is not 
intended to be limited by the figures of the accompanying drawings. 
[0010] Figure 1A illustrates exemplary components contributing to clock
skew in an integrated circuit. 

[0011] Figure 1B illustrates an exemplary system having components 
configured to operate with a common clock signal. 

[0012] Figure 1C is a timing diagram illustrating an exemplary operation 
of the system of Figure 1B. 

[0013] Figure 2A illustrates a prior art SRAM with a PLL/DLL. 

[0014] Figure 2B is a timing diagram illustrating the relationship between 
clock signals.

[0015] Figure 3 illustrates one embodiment of a line card or blade of a 
router having a CAM device configured to perform concurrent lookups.
[0016] Figure 4 illustrates one embodiment of a CAM device having a 
programmable delay circuit. 

[0017] Figure 5 A illustrates one embodiment of a programmable delay 
circuit. 

[0018] Figure 5B is a timing diagram illustrating the relationship between 
a reference clock signal and an internal delayed clock signal. 

[0019] Figure 6 illustrates an alternative embodiment of an integrated
circuit having multiple programmable delay circuits. 

[0020] Figure 7 illustrates one embodiment of a delay element used within
a programmable delay circuit. 

[0021] Figure 8 is an exemplary table of the programming values for 
selecting a particular delay element based on the frequency of a reference clock 
signal.

[0022] Figure 9 is an exemplary illustration of the delayed internal clock 
signal requirement over different frequencies of the reference clock signal. 

DETAILED DESCRIPTION

[0023] In the following description, numerous specific details are set forth, 
such as examples of specific components, circuits, delays, etc., in order to provide
a thorough understanding of the present invention. It will be apparent, 
however, to one skilled in the art that these specific details need not be employed 
to practice the present invention. In other instances, well known components or 
methods have not been described in detail in order to avoid unnecessarily 
obscuring the present invention. 

[0024] The term "coupled" as used herein means connected directly to or 
indirectly connected through one or more intervening components or circuits. 
Additionally, the interconnection between components, circuits, blocks, etc., 
may be shown as buses or as single signal lines. Each of the buses may 
alternatively be a single signal line and each of the single signal lines may 
alternatively be buses.

[0025] Embodiments of the present invention include various method 
steps, which will be described below. The steps may be performed by hardware 
components or may be embodied in machine-executable instructions, which may 
be used to cause hardware components (e.g., a processor, programming circuit) 
programmed with the instructions to perform the steps. Alternatively, the steps 
may be performed by a combination of hardware and software. 
[0026] Embodiments of the present invention may be provided as a
computer program product, or software, that may include a machine-readable 
medium having stored thereon instructions. The machine readable medium may 
be used to program a computer system (or other electronic devices) to generate 
articles (e.g., wafer masks) used to manufacture embodiments of the present 
invention. The machine-readable medium may include, but is not limited to, 
floppy diskettes, optical disks, CD-ROMs, magneto-optical disks, ROMs, 
RAMs, EPROMs, EEPROMs, magnetic or optical cards, flash memory, or other
type of media / machine-readable medium suitable for storing electronic 
instructions. 

[0027] The machine readable medium may store data representing an 
integrated circuit design layout that includes embodiments of the present 
invention. The design layout for the integrated circuit die may be generated 
using various means, for example, schematics, text files, gate-level netlists,
hardware description languages, layout files, etc. The design layout may be 
converted into mask layers for fabrication of wafers containing one or more 
integrated circuit dies. The integrated circuit dies may then be assembled into 
packaged components. Design layout, mask layer generation, and the 
fabrication and packaging of integrated circuit dies are known in the art; 
accordingly, a detailed discussion is not provided. 
[0028] The method and apparatus described herein provide for a
programmable delay circuit to compensate for reference clock signal skew in 
integrated circuits such as a content addressable memory (CAM). In one 
embodiment, the apparatus includes a programmable delay circuit coupled 
between a clock pad and an output register. The programmable delay circuit 
includes a programmable register and a delay selection circuit. The delay 
selection circuit includes a plurality of delay elements coupled to a multiplexer. 
The programmable register is coupled with the multiplexer. In one
embodiment, a user may program the programmable register during device 
initialization with time period information for delaying a reference clock signal 
received at the clock pad. During device operation, the time period information 
is decoded and transmitted to the multiplexer to select from among delayed 
clock signals generated by the plurality of delay elements. The selected delayed 
clock signal is provided to clock the data output from the output register. 
[0029] In another embodiment, the apparatus may also include a second 
programmable delay circuit coupled between the clock pad and the output
register. The second programmable delay circuit may be used to fine tune the 
desired delay for the internal clock signal clocking the output register. The 
desired delay may be fine tuned, for example, based on a particular frequency of 
operation of the reference clock signal. The second programmable delay circuit 
may be coupled to its own programmable register to receive the fine tuning time 
period information to select among a plurality of delayed clock signals generated
by a plurality of delay elements in the second programmable delay circuit. 
Alternatively, different bit positions of the same programmable register may be 
coupled to the first and second delay selection circuits. 
[0030] The use of programmable delay circuits as discussed above may 
enable integrated circuits and, in particular, CAM devices to operate at higher 
frequencies than would otherwise be possible. By using a programmable 
register to select from among clock signals that are phase shifted by delay 
elements, a delayed internal clock signal, relative to a reference clock signal, may 
be generated without the use of a feedback circuit as with prior art solutions. This
may provide for the generation of a more precise delayed internal clock signal 
across various frequencies of operation of the reference clock signal. 
[0031] The programmable delay circuits discussed herein may be 
particularly useful in CAM devices that may typically be noisier than other types 
of memory devices such as an SRAM. For example, during a compare operation, 
the entire CAM array may be active, thereby generating a large amount of 
current switching that would have an adverse effect on a PLL or DLL if they 
were operated with the same power supplies as the CAM array. Although 
separate power supplies may be used for a PLL/DLL and other components
within a device, such separation would require extensive design work and die 
area in order to implement, which may not be feasible. Additionally, shielding a 
PLL or DLL from the substrate noise generated by the simultaneous switching of
the CAM circuits can be difficult. Moreover, the programmable delay circuits 
discussed herein may use the same power supply as the CAM array and this will 
enable the programmable delay circuit to scale in operating voltage with the 
CAM array as fabrication processes scale to smaller and smaller geometries. 
[0032] The programmable delay circuits discussed herein also are not 
limited to a particular operating frequency range, thus increasing the 
applications in which the programmable delay circuits may be used. 
[0033] Figure 3 illustrates one embodiment of a line card or blade of a 
router including a CAM device having a programmable delay circuit. Line card
300 includes processor 310, ingress interface circuitry 330, egress interface 
circuitry 340, CAM device 320, associated data storage unit 370, traffic manager 
360, and payload storage unit 350. 

[0034] Processor 310 functions to control the overall operation of line card 
300 in cooperation with the other components of line card 300. For example,
processor 310 receives packets from a network medium through ingress interface 
circuitry 330, stores the payload of packets in payload storage unit 350, and 
processes packet header information to determine required lookups in CAM 
device 320 and subsequent handling of the packets, as discussed herein. Ingress 
interface circuitry 330 includes, for example, PHY and MAC devices. Processor 310 
sends out packets on a network medium through egress interface circuitry 340 based on the 
lookups performed by CAM device 320. Egress interface circuitry 340 may be
connected to a switch fabric or directly to one or more other routers or switches. 
Processor 310 may be one or more network processor units (NPUs), 
microprocessors, or one or more special purpose processors such as a digital 
signal processor (DSP). In another embodiment, processor 310 may be another 
type of controller, for example, a field programmable gate array or a general 
purpose processor. The processor 310, ingress interface circuitry 330, and egress 
interface circuitry 340 components of a router are known in the art; accordingly, 
a detailed discussion is not provided. 

[0035] In response to information in a packet header, for a particular 
packet, processor 310 determines the number and types of lookups to be 
performed by one or more of CAM devices 320, and forms the search keys for 
these lookups. The searches or lookups may include, for example, classification 
lookups, forwarding lookups (e.g., Next Hop or longest prefix match (LPM) 
lookup), MAC lookup, MPLS lookup, etc. For example, when multiple
searches are required, processor 310 forms a composite search key that includes 
at least two, and as many as all, of the various search keys for the lookups. The 
composite search key may be provided as a common input string to CAM device 
320. For one embodiment, CAM device 320 selectively identifies and extracts the 
individual search keys from the input string and provides the individual search 
keys to the associated CAM blocks to perform the lookups. Advantageously, the 
lookups can then occur concurrently or simultaneously in the CAM blocks of
CAM device 320, thereby increasing overall throughput over conventional 
systems in which searches are processed sequentially. 
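
One way to picture the composite search key is as a concatenation of fixed-width fields. The following Python sketch is illustrative only: the field names, the field widths, and the functions form_composite_key and extract_key are hypothetical, and the description above does not specify how CAM device 320 identifies the individual search keys within the input string.

```python
def form_composite_key(fields):
    """Concatenate (name, value, width_bits) fields into one integer key.

    Returns the composite value and a layout mapping each field name to
    (offset_bits, width_bits). The first field occupies the most
    significant bits. The fixed-width layout is an assumption.
    """
    total_bits = sum(width for _, _, width in fields)
    composite = 0
    layout = {}
    offset = total_bits
    for name, value, width in fields:
        if value < 0 or value >> width:
            raise ValueError(f"{name} does not fit in {width} bits")
        offset -= width
        composite |= value << offset
        layout[name] = (offset, width)
    return composite, layout


def extract_key(composite, layout, name):
    """Recover one individual search key from the composite key."""
    offset, width = layout[name]
    return (composite >> offset) & ((1 << width) - 1)


# Hypothetical lookups and widths, chosen only for illustration.
fields = [
    ("classification", 0xAB, 8),
    ("forwarding", 0xC0A80001, 32),
    ("mac", 0x001122334455, 48),
]
composite, layout = form_composite_key(fields)
recovered = {name: extract_key(composite, layout, name) for name, _, _ in fields}
```

In a device, each extracted key would be routed to its CAM block so that the lookups proceed at the same time; the dictionary comprehension here merely shows that every key is recoverable from the common input string.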

[0036] CAM device 320 may be a multiple block CAM device with each 
block capable of storing a different table for comparand lookups. Alternatively,
CAM device 320 may represent multiple, single block CAM devices (e.g., with 
each single block CAM device formed on a different integrated circuit substrate) 
with each CAM device used to store a different table for comparand lookup. 
After one or more lookups are executed in CAM device 320, associated 
information for matching entries (e.g., additional routing information and/or 
packet information) may be retrieved from associated data storage unit 370. Processor
310 then communicates with traffic manager 360 to schedule the exit of a packet 
from line card 300 via egress interface circuitry 340. 

[0037] Figure 4 illustrates one embodiment of a CAM device having a 
programmable delay circuit. In one embodiment, CAM device 420 may be the 
CAM device 320 of Figure 3. CAM device 420 includes instruction decoder 406, 
comparand register 408, address logic 404, CAM array 402, priority encoder logic 
412, read/write circuitry 416, match flag logic 414, programmable delay circuit
415, clock pad 431, input buffer 432, and output registers 422, 424, and 426. 
[0038] CAM array 402 includes CAM cells that may be of any type of 
CAM cells, including NAND and NOR based binary, ternary or other CAM cells, 
that may be formed from either volatile or non-volatile elements and include at
least one storage element and one compare circuit. CAM device 420 may be 
instructed, for example, by processor 310 of Figure 3 to compare a search key, 
also referred to as a comparand (e.g., packet header data), with data stored in its 
associative memory array 402. Multiple entries in the CAM array are 
simultaneously compared with all, or a portion, of the search key to identify 
matching entries. 
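
The compare operation described above may be sketched in software for a ternary CAM. The following Python sketch is illustrative only: representing each entry as a (value, mask) pair, with mask bits of 1 compared and mask bits of 0 treated as "don't care", is an assumption of this example, as are the function names. A software loop visits the entries one after another, whereas the CAM array compares them simultaneously.

```python
def entry_matches(stored_value, stored_mask, search_key):
    """Ternary compare: only bit positions with a mask bit of 1 are compared."""
    return ((stored_value ^ search_key) & stored_mask) == 0


def search(entries, search_key):
    """Return one match-line value per entry (True indicates a match)."""
    return [entry_matches(value, mask, search_key) for value, mask in entries]


# Hypothetical 8-bit entries as (value, mask) pairs.
entries = [
    (0b1010_0000, 0b1111_0000),  # upper four bits compared, lower four ignored
    (0b1010_1100, 0b1111_1111),  # all eight bits compared
    (0b0101_0000, 0b1111_0000),  # upper four bits differ from the key below
]
match_lines = search(entries, 0b1010_1100)
print(match_lines)  # prints [True, True, False]
```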

[0039] Instruction decoder 406 decodes various instructions provided on 
instruction bus IBUS. The instructions may include instructions to write data to
CAM array 402, read data from CAM array 402, and to compare comparand data 
with data stored in the CAM array 402. The comparand data may be provided 
on comparand bus CBUS and stored in comparand register 408 or directly 
provided to CAM array 402. The CAM device 420 may also include one or more 
global mask registers (not shown) for the comparand data provided to CAM 
array 402. 

[0040] Data is written to CAM array 402 by read/write circuitry 416 and 
address logic 404. Address logic 404 selects one or more rows of CAM cells in 
response to an address. The write data is provided to the selected cells (e.g., over 
one or more data bit lines) by the write portion of read/write circuitry 416 (e.g.,
write buffers). Data is read from one or more selected rows of CAM cells by the 
read portion of read/write circuitry 416 (e.g., by a sense amplifier). 
[0041] Instruction decoder 406 receives a reference clock signal that may be 
generated externally from CAM device 420 and applied to clock pad 431. The 
reference clock signal may be received from, for example, processor 310 of Figure 
3. The instruction decoder 406 provides various control signals to address logic 
404, read/write circuitry 416, and comparand register 408 to control when CAM
device 420 performs one or more operations according to the timing of the 
reference clock signal. Additionally, instruction decoder 406 may provide one or 
more control signals to CAM array 402, priority encoder logic 412, match flag 
logic 414, programmable delay circuit 415 and/or output registers 422, 424, and
426 to enable these circuits to perform their associated functions at an 
appropriate time. 

[0042] When CAM array 402 is searched for a match of the stored data 
with the key (comparand data), each matching location indicates a match on one 
of match lines 403. Match flag logic 414 indicates the existence of a match if at 
least one of the match lines 403 carries a match signal indicating a match. 
Additional flag logic such as almost full flag logic, full flag logic, and/or multiple
match flag logic may also be included in CAM device 420. 
[0043] Priority encoder logic 412 translates a matched location(s) into an 
index (or a match address) and outputs this index to output register 422 (which 
may be accessed by processor 310 of Figure 3). Priority encoder logic 412 also 
identifies which matching location has the top priority if there is more than one
matching entry. 
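
The match flag and priority encoding functions described above may be sketched as follows. This Python sketch is illustrative only: the function names are hypothetical, and the rule that the lowest-numbered matching location has the top priority is an assumption of this example, since the description above does not state the priority ordering.

```python
def match_flag(match_lines):
    """Return True if at least one match line carries a match signal."""
    return any(match_lines)


def priority_encode(match_lines):
    """Return the index of the top-priority match, or None if no match.

    Assumes, for illustration, that the lowest index has the top priority.
    """
    for index, is_match in enumerate(match_lines):
        if is_match:
            return index
    return None


lines = [False, True, False, True]  # locations 1 and 3 both match
print(match_flag(lines), priority_encode(lines))  # prints "True 1"
```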

[0044] CAM device 420 also includes a programmable delay circuit 415 
coupled between CLK pad 431 and the output registers 422, 424, and 426. 
Programmable delay circuit 415 generates an internal clock signal 439 that is 
delayed from reference clock signal 438 and is used to time the output of 
information stored in output registers 422, 424, and 426. Programmable delay 
circuit 415 may be used to minimize the clock skew between the clock signal 
triggering, for example, instruction decoder 406 and the output registers 422, 424, 
and 426. For example, clock skew may result from CLK pad 431 delay, input 
buffer 432 delay, CLK buffer 433 delay, and the RC delay of the conductive trace 
435, as well as an output register delay and an output pad (not shown) delay.
Without programmable delay circuit 415, the clock skew within CAM device 420 
may result in the output registers not clocking out an index, match flag, or data 
to satisfy the setup time of a follow-on device using the same clock signal source 
as CAM device 420. As such, the use of programmable delay circuit 415 may 
enable CAM device 420 to operate at higher frequencies than would otherwise be 
possible. The operation of programmable delay circuit 415 is discussed in 
greater detail below. 

[0045] It should be noted that in an alternate embodiment, CAM device 
420 may have other configurations. For example, the CAM device 420 may 
include multiple match flag logic to assert a multiple match flag. For another 
example, instruction decoder 406 may be omitted and various read, write and 
compare control signals and the clock signal provided directly to one or more of 
the circuit blocks. 

Attorney Docket No.: 002489.P035

[0046] Programmable delay circuit 415 may be particularly useful in CAM 
device 420 since the CAM device 420 may be noisier than other types of memory 
devices due to simultaneously comparing data with many CAM cells in CAM 
array 402. Such a compare operation generates a large amount of current 
switching that would have less of an adverse effect on programmable delay
circuit 415 than on a PLL/DLL if they were operated with the same power 
supply as CAM array 402. In addition, a lower voltage power supply may be 
used to power programmable delay circuit 415 than may otherwise be needed to 
power a PLL or DLL, and the voltage power supply for programmable delay 
circuit 415 may scale with that of other circuits, such as the CAM array, as 
fabrication processes scale to smaller and smaller geometries. Furthermore, 
programmable delay circuit 415 is not limited to a particular operating frequency 
range, thus increasing the applications in which programmable delay circuit 415 
may be used. 

[0047] Figure 5A illustrates programmable delay circuit 515 that is one 
embodiment of programmable delay circuit 415 of Figure 4. Programmable 
delay circuit 515 may also be used in other types of integrated circuits, for 
element). Delay element D, 572 represents a predetermined amount of time
period delay greater than zero and delay element 579 represents the greatest 
amount of time period delay. 

[0050] The internal clock signals generated by the delay elements have the 
same frequency as reference clock signal 438 but are phase shifted relative to 
reference clock signal 438. 

[0051] Multiplexer 550 operates to select from among the outputs of the 
delay elements under the control of programmable register 560. Multiplexer 550 
transmits the selected internal clock signal 439 to a clocking input 526 of output 
register 422. Programmable register 560 is coupled to decoder 555 that, in turn, 
is coupled to multiplexer 550. Programmable register 560 is an n-bit register that 
stores information that is used to determine which output of delay elements 571- 
579 is to be selected by multiplexer 550. For other embodiments, programmable 
register 560 may have a different number of bits. Decoder 555 decodes the 
information stored in programmable register 560 and generates control signals 
that are transmitted (e.g., via signal line(s) 556) to multiplexer 550 for selecting 
one of the outputs of the delay elements 571-579 based on the information stored 
in the programmable register 560. It should be noted that although decoder 555 
is illustrated separately from multiplexer 550, the operations of decoder 555 may 
be incorporated into multiplexer 550 or performed by a separate circuit. 
Similarly, the operations of other illustrated components of the figures may be 
performed by separate circuits or incorporated within other circuits.
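
The register, decoder, and multiplexer arrangement described above may be pictured with a small behavioral model. The following Python sketch is illustrative only: the class name, method names, the one-hot form of the decoder output, and the tap delay values are all assumptions of this example and are not taken from the figures.

```python
class ProgrammableDelayModel:
    """Behavioral sketch: register -> decoder -> multiplexer over delay taps."""

    def __init__(self, tap_delays_ns):
        self.tap_delays_ns = list(tap_delays_ns)
        self.register_value = 0  # selects the first tap until programmed

    def program(self, value):
        """Store a tap selection, as a user might during initialization."""
        if not 0 <= value < len(self.tap_delays_ns):
            raise ValueError("register value selects no delay tap")
        self.register_value = value

    def decode(self):
        """Return one-hot select lines, one per delay tap."""
        return [i == self.register_value for i in range(len(self.tap_delays_ns))]

    def internal_clock_edges(self, reference_edges_ns):
        """Return the edge times of the selected delayed clock signal."""
        selects = self.decode()
        delay_ns = next(d for d, s in zip(self.tap_delays_ns, selects) if s)
        return [edge + delay_ns for edge in reference_edges_ns]


# Hypothetical taps of 0, 1, 2, and 3 ns; register programmed to the last tap.
delay_circuit = ProgrammableDelayModel([0.0, 1.0, 2.0, 3.0])
delay_circuit.program(3)
reference_edges = [0.0, 5.0, 10.0]  # 200 MHz reference, one edge every 5 ns
internal_edges = delay_circuit.internal_clock_edges(reference_edges)
print(internal_edges)  # prints [3.0, 8.0, 13.0]
```

The edges of the selected clock keep the 5 ns spacing of the reference, so the frequency is unchanged and only the phase is shifted. No feedback path appears anywhere in the model.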
[0052] The resulting internal clock signal 439 is phase shifted from 
reference clock signal 438 such that both clock signals have the same frequency 
but the triggering edges of the internal clock signal 439 are delayed with respect 
to the triggering edges of the reference clock signal 438, as illustrated in Figure 
5B. The internal clock signal 439 can then be used to clock one of the output 
registers in a sufficient amount of time, for example, to satisfy the setup time of a 
follow-on device that is clocked by reference clock signal 438 and receives a signal from one
of the output registers. 

[0053] For another embodiment, multiplexer 550 may be replaced with a 
1-to-2^n demultiplexer that receives the reference clock signal 438 and, in response
to decoder 555, selects one of the delay paths 571-579 to provide the internal 
clock 439 to the input of output register 425. 

[0054] Similar to that mentioned with respect to programmable delay 
circuit 415 of Figure 4, programmable delay circuit 515 may be particularly 
useful in a CAM device that may be noisier than other types of memory devices 
due to the simultaneous comparing of data with many CAM cells in a CAM 
array. Such a compare operation generates a large amount of current switching 
that would have less of an adverse affect on programmable delay circuit 515 than 
on a PLL/DLL if they were operated with the same power supply as used to 
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power a CAM array. In addition, a lower voltage power supply may be used to 
power programmable delay circuit 515 than may otherwise be needed to power a 
PLL or DLL, and the voltage power supply for programmable delay circuit 515 
may scale with that of other circuits, such as the CAM array, as fabrication 
processes scale to smaller and smaller geometries. Furthermore, programmable 
delay circuit 515 is not limited to a particular operating frequency range, thus 
increasing the applications in which programmable delay circuit 515 may be 
used. 

[0055] In one embodiment, programmable register 560 may be pre-programmed 
by a user (e.g., a line card manufacturer, router manufacturer, end 
customer) to select a particular delay path or delay element 571-579 prior to 
operation of the device in its designated system. Programmable register 560 may 
be accessible to the user for programming through one or more input/output 
(I/O) pins (not shown) of the packaged CAM device. The user may establish a 
connection to the register through such programming I/O pins and thereby 
program the register through hardware or software means. 
[0056] A particular delay element may be selected based upon an 
anticipated clock skew for a particular integrated circuit. The clock skew for an 
integrated circuit operating at a particular frequency, or frequency range, may be 
anticipated based on, for example, calculations, simulations, empirical data from 
prototype units, and the like. 
[0057] Figure 8 is an exemplary table of the programming values for 
selecting a particular delay element based on the frequency (or a range of 
frequencies) of a reference clock signal. In one embodiment, for example, it may 
be determined that a particular CAM integrated circuit design operating at 200 
MHz is anticipated to have a clock skew of 4.5 ns. As such, one of the delay 
elements (e.g., delay element 579 of Figure 5) may be designed to provide a 3 ns 
delay. Where, for example, a 4 bit register is used, a user may program 
programmable register 560 of Figure 5 with a 0000 binary pattern which, when 
decoded by decoder 555 of Figure 5, causes multiplexer 550 of Figure 5 to select 
the internal clock signal generated by delay element 579. 
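
The kind of lookup implied by a table such as Figure 8 can be sketched as follows. Only the pairing of 200 MHz with a 3 ns delay and a 0000 register pattern, and the pairing of 333 MHz with a 2 ns delay, are taken from the examples in this specification; the frequency range boundaries, the 0001 code, and the function name are hypothetical placeholders, since the actual contents of Figure 8 are not reproduced here.

```python
# Each row: (min MHz inclusive, max MHz exclusive, delay in ns, 4-bit code).
# Range boundaries and the 0b0001 code are assumed for illustration.
PROGRAMMING_TABLE = [
    (150.0, 250.0, 3.0, 0b0000),  # covers the 200 MHz example
    (250.0, 350.0, 2.0, 0b0001),  # covers the 333 MHz example
]


def register_code_for(freq_mhz: float) -> tuple[int, float]:
    """Return (register code, delay in ns) for a reference clock frequency."""
    for low, high, delay_ns, code in PROGRAMMING_TABLE:
        if low <= freq_mhz < high:
            return code, delay_ns
    raise LookupError(f"no table entry covers {freq_mhz} MHz")


code, delay_ns = register_code_for(200.0)
print(f"program register with {code:04b} for a {delay_ns} ns delay")
```

A user or system software could consult such a table once, before operation, and write the resulting pattern into the programmable register.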
[0058] In one embodiment, the programmable delay circuit 515 may be 
programmed, or reprogrammed, to select a different delayed clock signal to 
accommodate a different frequency of operating the reference clock signal. For 
example, if the reference clock signal is desired to be run at 200 MHz (i.e., 5 ns 
cycle) and the clock skew and setup time, for a follow on device, are anticipated 
to be 4.5 ns and 0.5 ns, respectively, then a 3 ns delay path could be selected to 
satisfy the setup time of the follow on device, as illustrated by Figure 9. 
Correspondingly, programmable delay circuit 515 may be programmed to 
provide a 3 ns delay for the internal clock signal. If the same integrated circuit 
were then used with a reference clock signal running at 333 MHz (i.e., a 3 ns 
cycle) then (with the same assumptions of clock skew and setup time), a 3 ns 
delay would generate an internal clock signal with a trigger edge beyond that of 
the 3 ns cycle, which would not allow for sufficient setup time for a follow on 
device to use a succeeding edge of the reference clock signal. A user could program, or 
reprogram, the programmable register 560 to select, continuing the previous 
example, a 2 ns delay for the internal clock signal if the reference clock was to be 
run at 333 MHz. In this manner, the appropriate clock signal input to data 
output skew may be generated across various frequencies of the reference clock 
signal. 
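
The arithmetic behind this example can be sketched in a few lines. The check below uses a deliberately simplified criterion, assumed here for illustration and not taken from the specification: the selected delay plus the setup time of the follow on device must not exceed one cycle of the reference clock. It ignores the anticipated clock skew term and any other path delays, so it only shows why a 3 ns delay that is acceptable at 200 MHz stops being acceptable at 333 MHz. All function names are hypothetical.

```python
def period_ns(freq_mhz: float) -> float:
    """Cycle time in nanoseconds for a clock frequency in MHz."""
    return 1000.0 / freq_mhz


def delay_fits(delay_ns: float, setup_ns: float, freq_mhz: float) -> bool:
    """Simplified check (an assumption): delay + setup fits in one cycle."""
    return delay_ns + setup_ns <= period_ns(freq_mhz)


def largest_fitting_delay(delays_ns, setup_ns, freq_mhz):
    """Pick the longest available delay that still passes the check."""
    fitting = [d for d in delays_ns if delay_fits(d, setup_ns, freq_mhz)]
    return max(fitting) if fitting else None


AVAILABLE_NS = [1.0, 2.0, 3.0]  # assumed set of delay elements
SETUP_NS = 0.5                  # setup time from the example above

for f in (200.0, 333.0):
    choice = largest_fitting_delay(AVAILABLE_NS, SETUP_NS, f)
    print(f"{f} MHz: period {period_ns(f):.2f} ns, choose {choice} ns")
```

Under these assumptions the sketch selects the 3 ns delay at 200 MHz and the 2 ns delay at 333 MHz, matching the choices described in the text.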

[0059] The selection of a particular time period of delay for a given 
frequency provides an advantage over prior art delay methods using a 
PLL/DLL. As previously discussed, the internal clock signal generated by a 
PLL/DLL has a fixed amount of delay with respect to the reference clock signal 
and the phase shift of the internal clock signal is locked to the frequency of the 
reference clock signal. Although the fixed amount of delay may be adjusted, 
such adjustment may only be made as a discrete percentage of the total 
maximum allowable delay, thereby limiting the versatility of a PLL/DLL. 
[0060] Figure 6 illustrates another embodiment of an integrated circuit 
having multiple programmable delay circuits. Integrated circuit 620 may be 
CAM device 420 and programmable delay circuit 615 may be the programmable 
delay circuit 415 of Figure 4. Integrated circuit 620 may include multiple 
programmable delay circuits to refine the selection of an internal clock signal 
having a desired delay with respect to a reference clock signal. In the illustrated 
embodiment of Figure 6, for example, integrated circuit 620 may have a first 
programmable delay circuit 615 and a second programmable delay circuit 616 
coupled between the clock input pad 631 and output register 625. The second 
programmable delay circuitry 616 may be used to fine tune the desired delay for 
factors that may vary the anticipated clock skew for a given frequency, for 
example, process and temperature variations. The second programmable delay 
circuitry 616 may also be used to reduce the total number of delay elements 
needed by using combinations of delay elements rather than a separate delay 
element for each delay path. 

[0061] In one embodiment, each of programmable delay circuits 615 and 
616 may be similar to programmable delay circuit 515 of Figure 5A. In such an 
embodiment, second programmable delay circuit 616 includes its own 
programmable register to receive the fine tuning time period information to 
select among the various delay elements in delay circuit 616. Alternatively, 
programmable delay circuits 615 and 616 may share a common programmable 
register with different bit positions of the common programmable register 
coupled to respective decoders in programmable delay circuits 615 and 616. 
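
The shared register arrangement can be illustrated with a small sketch in which the upper bit positions of a common register drive a coarse delay circuit and the lower bit positions drive a fine delay circuit, with the two delays adding in series. The field widths, the assignment of upper bits to the coarse stage, the delay values, and the function names are all assumptions for illustration.

```python
def split_register(value: int, coarse_bits: int, fine_bits: int):
    """Split a common register into (coarse field, fine field)."""
    if not 0 <= value < 2 ** (coarse_bits + fine_bits):
        raise ValueError("register value out of range")
    fine = value & ((1 << fine_bits) - 1)   # lower bits -> fine decoder
    coarse = value >> fine_bits             # upper bits -> coarse decoder
    return coarse, fine


def total_delay_ns(value, coarse_delays_ns, fine_delays_ns,
                   coarse_bits=2, fine_bits=2):
    """Delay of two cascaded stages selected by one shared register."""
    coarse, fine = split_register(value, coarse_bits, fine_bits)
    return coarse_delays_ns[coarse] + fine_delays_ns[fine]


# Hypothetical element values: 4 coarse steps and 4 fine steps.
COARSE_NS = [0.0, 1.0, 2.0, 3.0]
FINE_NS = [0.0, 0.25, 0.5, 0.75]

# Register pattern 1101: coarse field 11 (3 ns) plus fine field 01 (0.25 ns).
print(total_delay_ns(0b1101, COARSE_NS, FINE_NS))
```

With these assumed values, eight delay elements in total yield sixteen distinct delays, whereas a single stage would need a separate element for each of the sixteen paths.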
[0062] The use of programmable delay circuits as discussed above may 
enable integrated circuits and, in particular, CAM devices to operate at higher 
frequencies than would otherwise be possible. By using a programmable 
register to select from among clock signals that are phase shifted by delay 
elements, a delayed internal clock signal, relative to a reference clock signal, may 
be generated without the use of a feedback circuit as with prior art solutions. This 
may provide for the generation of a more precise delayed internal clock signal 
across various frequencies of operation of the reference clock signal. 
[0063] The programmable delay circuits discussed herein may be 
particularly useful in CAM devices (e.g., CAM device 420) that may typically be 
noisier than other types of memory devices (e.g., an SRAM). For example, 
during a compare operation, the entire CAM array 402 is active, thereby 
generating a large amount of current switching that would have an adverse 
effect on a PLL or DLL if it were operated on the same power supplies as 
CAM array 402. The programmable delay circuits discussed herein may 
advantageously be operated with the same power supplies as the other digital 
components of the CAM device. 

[0064] One or more of the programmable delay circuits discussed above 
may also be used to clock other circuits in a CAM device including, for example, 
any of the circuits shown in CAM device 420 of Figure 4. 
[0065] In the foregoing specification, the invention has been described 
with reference to specific exemplary embodiments thereof. It will, however, be 
evident that various modifications and changes may be made thereto without 
departing from the broader spirit and scope of the invention as set forth in the 
appended claims. The specification and drawings are, accordingly, to be 
regarded in an illustrative sense rather than a restrictive sense. 